The density function of the limiting spectral distribution of general
sample covariance matrices is usually unknown. We propose to use kernel
estimators, which are proved to be consistent. A simulation study is
also conducted to show the performance of the estimators.

1. Introduction. Suppose that $X_{ij}$ are independent and identically
distributed (i.i.d.) real random variables. Let $X_n = (X_{ij})_{p \times n}$
and let $T_n$ be a $p \times p$ nonrandom Hermitian nonnegative definite
matrix. Consider the random matrices

$A_n = \frac{1}{n} T_n^{1/2} X_n X_n^T T_n^{1/2}.$

When $EX_{11} = 0$ and $EX_{11}^2 = 1$, $A_n$ can be viewed as a sample
covariance matrix drawn from the population with covariance matrix $T_n$.
Moreover, if $T_n$ is another sample covariance matrix, independent of
$X_n$, then $A_n$ is a Wishart matrix.

Sample covariance matrices are of paramount importance in multivariate
analysis. For example, in principal component analysis, we need to estimate
eigenvalues of sample covariance matrices in order to obtain an interpretable
low-dimensional data representation. The matrices consisting of contemporary
data are usually large, with the number of variables proportional to
the sample size. In this setting, fruitful results have accumulated since the
celebrated Marčenko and Pastur law [8] was discovered; see the latest
monograph of Bai and Silverstein [4] for more details.

The basic limit theorem regarding $A_n$ concerns its empirical spectral
distribution $F^{A_n}$. Here, for any matrix $A$ with real eigenvalues, the
empirical spectral distribution $F^A$ is given by

$F^A(x) = \frac{1}{p} \sum_{k=1}^{p} I(\lambda_k \le x),$

where $\lambda_k$, $k = 1, \ldots, p$, denote the eigenvalues of $A$.

Suppose the ratio of the dimension to the sample size $c_n = p/n$ tends
to $c$ as $n \to \infty$. When $T_n$ is the identity matrix, $F^{A_n}$ tends
to the so-called Marčenko and Pastur law with the density function

$f_c(x) = (2\pi c x)^{-1} \sqrt{(b - x)(x - a)}$ for $a \le x \le b$, and $f_c(x) = 0$ otherwise.

It has point mass $1 - c^{-1}$ at the origin if $c > 1$, where
$a = (1 - \sqrt{c})^2$ and $b = (1 + \sqrt{c})^2$ (see Bai and Silverstein [4]).
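The Marčenko and Pastur density above is easy to evaluate numerically. The following is a minimal Python sketch, assuming NumPy and the unit-variance case with $0 < c \le 1$; the function name `mp_density` is illustrative and not from the paper.

```python
import numpy as np

def mp_density(x, c):
    """Marcenko-Pastur density f_c(x) for ratio 0 < c <= 1 (unit variance)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    a = (1.0 - np.sqrt(c)) ** 2          # left edge of the support
    b = (1.0 + np.sqrt(c)) ** 2          # right edge of the support
    dens = np.zeros_like(x)
    inside = (x > a) & (x < b)           # density vanishes outside (a, b)
    xi = x[inside]
    dens[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * c * xi)
    return dens
```

For example, `mp_density(np.linspace(0.3, 2.2, 5), 0.25)` evaluates the law for $c = 1/4$, whose support is $[0.25, 2.25]$.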
In the literature, it is also common to study

$B_n = \frac{1}{n} X_n^T T_n X_n,$

since the eigenvalues of $A_n$ and $B_n$ differ by $|n - p|$ zero
eigenvalues. Thus,

(1.1)    $F^{B_n}(x) = \left(1 - \frac{p}{n}\right) I(x \in [0, \infty)) + \frac{p}{n} F^{A_n}(x).$
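The relation (1.1) can be checked numerically on a small example. The sketch below assumes NumPy, a diagonal $T_n$ with a hypothetical spectrum drawn at random, and $p < n$; the helper name `check_companion_spectra` is illustrative only.

```python
import numpy as np

def check_companion_spectra(p=5, n=8, seed=0):
    """Return sorted eigenvalues of A_n (p x p) and B_n (n x n) for p < n."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    t = rng.uniform(0.5, 2.0, size=p)      # hypothetical spectrum of T_n
    T_sqrt = np.diag(np.sqrt(t))
    A = T_sqrt @ X @ X.T @ T_sqrt / n      # A_n = T^{1/2} X X^T T^{1/2} / n
    B = X.T @ np.diag(t) @ X / n           # B_n = X^T T X / n
    return np.sort(np.linalg.eigvalsh(A)), np.sort(np.linalg.eigvalsh(B))

def esd(eigs, x):
    """Empirical spectral distribution evaluated at x."""
    return float(np.mean(eigs <= x))
```

With $p = 5$ and $n = 8$, the three smallest eigenvalues of $B_n$ are numerically zero and the remaining five coincide with those of $A_n$, which is exactly what (1.1) encodes.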

When $F^{T_n}$ converges weakly to a nonrandom distribution $H$, Marčenko
and Pastur [8], Yin [16] and Silverstein [13] proved that, with probability
one, $F^{B_n}(x)$ converges in distribution to a nonrandom distribution
function $\underline{F}_{c,H}(x)$ whose Stieltjes transform
$\underline{m}(z) = m_{\underline{F}_{c,H}}(z)$ is, for each
$z \in \mathbb{C}^+ = \{z \in \mathbb{C} : \Im z > 0\}$, the unique solution
to the equation

(1.2)    $\underline{m} = -\left(z - c \int \frac{t\,dH(t)}{1 + t\underline{m}}\right)^{-1}.$

Here, the Stieltjes transform $m_F(z)$ for any probability distribution
function $F(x)$ is defined by

(1.3)    $m_F(z) = \int \frac{1}{x - z}\,dF(x), \qquad z \in \mathbb{C}^+.$
Therefore, from (1.1), we have

(1.4)    $\underline{F}_{c,H}(x) = (1 - c) I(x \in [0, \infty)) + c F_{c,H}(x),$

where $F_{c,H}(x)$ is the limit of $F^{A_n}(x)$. As a consequence of this
fact, we have

(1.5)    $\underline{m}(z) = -\frac{1 - c}{z} + c\,m(z).$
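The passage from (1.4) to (1.5) is a one-line computation. The worked step below spells it out, using only definition (1.3) and the observation that the distribution function $I(x \in [0,\infty))$ is the point mass at zero; here $m(z)$ is read as the Stieltjes transform of $F_{c,H}$.

```latex
\begin{align*}
\underline{m}(z)
  &= \int \frac{1}{x - z}\, d\underline{F}_{c,H}(x) \\
  &= (1 - c) \int \frac{1}{x - z}\, dI(x \in [0,\infty))
     + c \int \frac{1}{x - z}\, dF_{c,H}(x) \\
  &= (1 - c)\,\frac{1}{0 - z} + c\, m(z)
   = -\frac{1 - c}{z} + c\, m(z).
\end{align*}
```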

Moreover, $\underline{m}(z)$ has an inverse,

(1.6)    $z(\underline{m}) = -\frac{1}{\underline{m}} + c \int \frac{t\,dH(t)}{1 + t\underline{m}}.$

Relying on this inverse, Silverstein and Choi [14] carried out a remarkable
analysis of the analytic behavior of $\underline{F}_{c,H}(x)$.

When $T_n$ is the identity matrix, there is an explicit solution to (1.2).
In this case, from (1.1), we see that the density function of
$\underline{F}_{c,H}(x)$ is

$\underline{f}_c(x) = (1 - c) I(c < 1)\,\delta_0 + c f_c(x),$

where $\delta_0$ is the point mass at 0. Unfortunately, there is no explicit
solution to (1.2) for general $T_n$. Although we can use $F^{A_n}(x)$ to
estimate $F_{c,H}(x)$, we cannot make any statistical inference on
$F_{c,H}(x)$ because there is, as far as we know, no central limit theorem
concerning $(F^{A_n}(x) - F_{c,H}(x))$. Actually, it is argued in Bai and
Silverstein [4] that the process $n(F^{A_n}(x) - F_{c,H}(x))$,
$x \in (-\infty, \infty)$, does not converge to a nontrivial process in any
metric space. This motivates us to pursue other ways of understanding the
limiting spectral distribution $F_{c,H}(x)$.

This paper is part of a program to estimate the density function
$f_{c,H}(x)$ of the limiting spectral distribution $F_{c,H}(x)$ of sample
covariance matrices $A_n$ by kernel estimators. In this paper, we will prove
the consistency of those estimators as a first step.

2. Methodology and main results. Suppose that the observations
$X_1, \ldots, X_n$ are i.i.d. random variables with an unknown density
function $f(x)$ and $F_n(x)$ is the empirical distribution function
determined by the sample. A popular nonparametric estimate of $f(x)$ is then

(2.1)    $f_n(x) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{x - X_j}{h}\right) = \frac{1}{h} \int K\left(\frac{x - y}{h}\right) dF_n(y),$

where the function $K(y)$ is a Borel function and $h = h(n)$ is the
bandwidth, which tends to 0 as $n \to \infty$. Obviously, $f_n(x)$ is again a
probability density function and, moreover, it inherits some smoothness
properties of $K(x)$, provided the kernel is taken as a probability density
function. Under some regularity conditions on the kernel, it is well known
that $f_n(x) \to f(x)$ in some sense (with probability one, or in
probability). There is a huge body of literature regarding this kind of
estimate. For example, one may refer to Rosenblatt [10], Parzen [9],
Hall [7] or the book by Silverman [12].

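Informed by (2.1), we propose the following estimator $f_n(x)$ of
$f_{c,H}(x)$:

(2.2)    $f_n(x) = \frac{1}{ph} \sum_{i=1}^{p} K\left(\frac{x - \mu_i}{h}\right) = \frac{1}{h} \int K\left(\frac{x - y}{h}\right) dF^{A_n}(y),$

where $\mu_i$, $i = 1, \ldots, p$, are eigenvalues of $A_n$. It turns out
that $f_n(x)$ is a consistent estimator of $f_{c,H}(x)$ under some
regularity conditions.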
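As an illustration of (2.2), the following Python sketch computes the kernel estimate from the eigenvalues of $A_n$. It assumes NumPy and takes the standard normal density as the default kernel; the function names and the optional `T_sqrt` argument (a square root of $T_n$) are illustrative choices, not part of the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, one admissible choice of K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def sample_covariance_eigenvalues(X, T_sqrt=None):
    """Eigenvalues of A_n = T^{1/2} X X^T T^{1/2} / n for a p x n matrix X."""
    p, n = X.shape
    Z = X if T_sqrt is None else T_sqrt @ X   # T_n = I when T_sqrt is omitted
    return np.linalg.eigvalsh(Z @ Z.T / n)

def kernel_spectral_density(x, eigenvalues, h, kernel=gaussian_kernel):
    """Estimator (2.2): (1 / (p h)) * sum_i K((x - mu_i) / h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.asarray(eigenvalues, dtype=float)
    u = (x[:, None] - mu[None, :]) / h        # all pairwise (x - mu_i) / h
    return kernel(u).sum(axis=1) / (mu.size * h)
```

A typical call is `kernel_spectral_density(grid, sample_covariance_eigenvalues(X), h)`, where the bandwidth `h` is left to the user.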
Suppose that the kernel function $K(x)$ satisfies

(2.3)    $\sup_{-\infty < x < \infty} |K(x)| < \infty, \qquad \lim_{|x| \to \infty} |x K(x)| = 0$

and

(2.4)    $\int K(x)\,dx = 1, \qquad \int |K'(x)|\,dx < \infty.$

Theorem 1. Suppose that $K(x)$ satisfies (2.3) and (2.4). Let $h = h(n)$
be a sequence of positive constants satisfying

(2.5)    $\lim_{n \to \infty} n h^{5/2} = \infty, \qquad \lim_{n \to \infty} h = 0.$

Moreover, suppose that all $X_{ij}$ are i.i.d. with $EX_{11} = 0$,
$\mathrm{Var}(X_{11}) = 1$ and $EX_{11}^{16} < \infty$. Also, assume that
$c_n \to c \in (0, 1)$. Let $T_n$ be a $p \times p$ nonrandom symmetric
positive definite matrix with spectral norm bounded above by a positive
constant such that $H_n = F^{T_n}$ converges weakly to a nonrandom
distribution $H$. In addition, suppose that $F_{c,H}(x)$ has a compact
support $[a, b]$ with $a > 0$. Then,

$f_n(x) \to f_{c,H}(x)$ in probability uniformly in $x \in [a, b]$.

Remark 1. We conjecture that the condition $EX_{11}^{16} < \infty$ can be
reduced to $EX_{11}^{4} < \infty$.

When $T_n$ is the identity matrix, we have a slightly better result.

Theorem 2. Suppose that $K(x)$ satisfies (2.3) and (2.4). Let $h = h(n)$
be a sequence of positive constants satisfying

(2.6)    $\lim_{n \to \infty} n h^{2} = \infty, \qquad \lim_{n \to \infty} h = 0.$

Moreover, suppose that all $X_{ij}$ are i.i.d. with $EX_{11} = 0$,
$\mathrm{Var}(X_{11}) = 1$ and $EX_{11}^{12} < \infty$. Also, assume that
$c_n \to c \in (0, 1)$. Denote the support of the MP law by $[a, b]$. Let
$T_n = I$. Then,

$\sup_{x \in [a, b]} |f_n(x) - f_c(x)| \to 0$ in probability.

Theorem 1 also gives the estimate of $F_{c,H}(x)$, as below.

Corollary 1. Under the assumptions of Theorem 1,

(2.7)    $F_n(x) \to F_{c,H}(x)$ in probability,

where

(2.8)    $F_n(x) = \int_{-\infty}^{x} f_n(t)\,dt.$

Corollary 1 and the Helly-Bray lemma ensure that we have the following.

Corollary 2. Under the assumptions of Theorem 1, if $g(x)$ is a
continuous bounded function, then

(2.9)    $\int g(x)\,dF_n(x) \to \int g(x)\,dF_{c,H}(x)$ in probability.

In order to prove consistency of the nonparametric estimates, we need
to develop a convergence rate for $F^{A_n}$. When $T_n = I$, Bai [1]
developed a Berry-Esseen-type inequality and investigated the convergence
rate of $EF^{A_n}$. Later, Götze and Tikhomirov [6] improved the
Berry-Esseen-type inequality and obtained a better convergence rate. For
general $T_n$, we establish the following convergence rate.

Theorem 3. Under the assumptions of Theorem 1,

(2.10)    $\sup_x |EF^{A_n}(x) - F_{c_n,H_n}(x)| = O\left(\frac{1}{n^{2/5}}\right)$

and

(2.11)    $E \sup_x |F^{A_n}(x) - F_{c_n,H_n}(x)| = O\left(\frac{1}{n^{2/5}}\right).$

Remark 2. Under the fourth moment condition, that is, $EX_{11}^{4} < \infty$,
we conjecture that the above rate $O(n^{-2/5})$ could be improved to
$O(n^{-1}\sqrt{\log n})$.

3. Applications. Let us demonstrate some applications of Theorems 1, 2
and their corollaries. Since $F_{c,H}(x)$ does not have an explicit
expression (except for some special cases), we may now use $F_n(x)$ to
estimate it, by Corollary 1. More importantly, $F_n(x)$ has some smoothness
properties, which $F^{A_n}$ does not have.

We first consider an example in wireless communication. Consider a
synchronous CDMA system with $n$ users and processing gain $p$. The
discrete-time model for the received signal $Y$ is given by

(3.1)    $Y = \sum_{k=1}^{n} x_k h_k + W,$

where $x_k \in \mathbb{R}$ and $h_k \in \mathbb{R}^p$ are, respectively, the
transmitted symbol and the signature spreading sequence of user $k$, and $W$
is the Gaussian noise with zero mean and covariance matrix $\sigma^2 I$.
Assume that the transmitted symbols of different users are independent, with
$Ex_k = 0$ and $E|x_k|^2 = p_k$. This model is slightly more general than
that in [15], where all of the users' powers $p_k$ are assumed to be the
same.

Following [15], consider the demodulation of user 1 and use the
signal-to-interference ratio (SIR) as the performance measure of linear
receivers. The SIR of user 1 for a linear receiver $c_1$ is defined by
(see [15])

$\beta_1 = \frac{(c_1^T h_1)^2 p_1}{c_1^T c_1 \sigma^2 + \sum_{k=2}^{n} (c_1^T h_k)^2 p_k}.$

The minimum mean square error (MMSE) receiver minimizes the mean
square error as well as maximizes the SIR for all users (see [15]). The SIR
of user 1 is given by

$\beta_1^{MMSE} = p_1 h_1^T (H_1 D_1 H_1^T + \sigma^2 I)^{-1} h_1,$

where

$D_1 = \mathrm{diag}(p_2, \ldots, p_n), \qquad H_1 = (h_2, \ldots, h_n).$

Assume that the $h_k$ are i.i.d. random vectors, each consisting of i.i.d.
random variables with appropriate moments. Moreover, suppose that
$p/n \to c > 0$ and $F^{D_1}(x) \to H(x)$. Then, by Lemma 2.7 in [2] and the
Helly-Bray lemma, it is not difficult to check that

$\beta_1^{MMSE} - p_1 \int \frac{1}{x + \sigma^2}\,dF_{c,H}(x) \to 0.$

To judge the performance of different receivers, we may then compare the
value of $\int \frac{1}{x + \sigma^2}\,dF_{c,H}(x)$ with the limiting SIR of
the other linear receiver. However, the awkward fact is that we usually do
not have an explicit expression for $F_{c,H}(x)$. Thus, we may use the kernel
estimate $\int \frac{1}{x + \sigma^2}\,dF_n(x)$ to estimate
$\int \frac{1}{x + \sigma^2}\,dF_{c,H}(x)$, by Corollary 2.
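The plug-in integral with respect to $F_n$ can be computed without a grid when the kernel is Gaussian, because $F_n$ is then an equal-weight mixture of normal laws centered at the eigenvalues. The sketch below is one possible implementation under that assumption, using Gauss-Hermite quadrature from NumPy. The integrand is truncated as $g(x) = 1/(\max(x, 0) + \sigma^2)$ so that it is bounded and continuous as Corollary 2 requires; this truncation and all function names are assumptions made for the illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def plug_in_integral(g, eigenvalues, h, n_nodes=60):
    """Approximate the integral of g dF_n for the standard normal kernel.

    F_n is the average of N(mu_i, h^2) laws, so the integral equals the
    average over i of E g(mu_i + h Z) with Z standard normal.
    """
    mu = np.asarray(eigenvalues, dtype=float)
    # Nodes and weights for the weight function exp(-t^2 / 2).
    nodes, weights = hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)   # weights now sum to one
    values = g(mu[:, None] + h * nodes[None, :])
    return float((values * weights[None, :]).sum(axis=1).mean())

def sir_integrand(sigma2):
    """g(x) = 1 / (max(x, 0) + sigma^2), bounded and continuous on the line."""
    return lambda x: 1.0 / (np.maximum(x, 0.0) + sigma2)
```

For instance, `plug_in_integral(sir_integrand(1.0), mu, h)` gives the kernel-based estimate of the integral for noise level $\sigma^2 = 1$, given eigenvalues `mu` of $A_n$.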

A second application: we may use $f_n(x)$ to infer, in some way, some
statistical properties of the population covariance matrix $T_n$.
Specifically, by (1.3), we may evaluate the Stieltjes transform of the
kernel estimator $f_n(x)$,

(3.2)    $m_{f_n}(z) = \int \frac{1}{x - z} f_n(x)\,dx, \qquad z \in \mathbb{C}^+.$
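A small numerical sketch of (3.2) and of the conversion (1.5) follows. It assumes NumPy and a Gaussian kernel, for which $f_n$ is a normal mixture and the integral can be approximated by Gauss-Hermite quadrature; the approximation is only reliable when the imaginary part of $z$ is not small relative to $h$. Function names are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def stieltjes_of_kernel_estimate(z, eigenvalues, h, n_nodes=80):
    """Approximate m_{f_n}(z) = int f_n(x) / (x - z) dx for a Gaussian kernel."""
    mu = np.asarray(eigenvalues, dtype=float)
    nodes, weights = hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)      # probability weights
    atoms = mu[:, None] + h * nodes[None, :]      # support points mu_i + h t_k
    per_eig = ((1.0 / (atoms - z)) * weights[None, :]).sum(axis=1)
    return complex(per_eig.mean())

def companion_transform(m, z, c):
    """Relation (1.5): underline-m(z) = -(1 - c) / z + c * m(z)."""
    return -(1.0 - c) / z + c * m
```

Calling `companion_transform(stieltjes_of_kernel_estimate(z, mu, h), z, p / n)` then yields the companion transform used in the remainder of this application.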

We may then obtain $\underline{m}_{f_n}(z)$, by (1.5). On the other hand, we
conclude from (1.6) that

(3.3)    $z = -\frac{1}{\underline{m}(z)} + c \int \frac{t\,dH(t)}{1 + t\underline{m}(z)}.$

Note that $\underline{m}(z)$ has a positive imaginary part. Therefore, with
notation $z_1 = -1/\underline{m}(z)$ and
$s(z_1) = -\underline{m}(z)\,(c^{-1} - 1 + c^{-1} z \underline{m}(z))$, we can
rewrite (3.3) as

(3.4)    $s(z_1) = \int \frac{dH(t)}{t - z_1}, \qquad z_1 \in \mathbb{C}^+.$

Consequently, in view of the inversion formula

(3.5)    $F\{[a, b]\} = \frac{1}{\pi} \lim_{v \to 0} \int_a^b \Im\, m_F(u + iv)\,du,$

we may recover $H(t)$ from $s(z_1)$ as given in (3.4). However, $s(z_1)$ can
be estimated by the resulting kernel estimate

(3.6)    $-\underline{m}_{f_n}(z)\,(c^{-1} - 1 + c^{-1} z \underline{m}_{f_n}(z)).$

Once $H(t)$ is estimated, we may further estimate the functions of the
population covariance matrix $T_n$, such as $\frac{1}{p}\mathrm{tr}\,T_n^2$.
Indeed, by the Helly-Bray lemma, we have

$\frac{1}{p}\mathrm{tr}\,T_n^2 = \int t^2\,dH_n(t) \to \int t^2\,dH(t).$

Thus, we may construct an estimator for $\frac{1}{p}\mathrm{tr}\,T_n^2$ based
on the resulting kernel estimate (3.6). We conjecture that the estimators of
$H(t)$ and the corresponding functions like $\frac{1}{p}\mathrm{tr}\,T_n^2$,
obtained by the above method, are also consistent. A rigorous argument is
currently being pursued.

4. Simulation study. In this section, we perform a simulation study to
investigate the behavior of the kernel density estimators of the Marčenko and
Pastur law. We consider two different populations, exponential and binomial
distributions. From each population, we generate two samples with sizes
$50 \times 200$ and $800 \times 3200$, respectively. We can therefore form
two random matrices, $(X_{ij})_{50 \times 200}$ and
$(X_{ij})_{800 \times 3200}$. The kernel is selected as

$K(x) = (2\pi)^{-1/2} e^{-x^2/2},$

which is the standard normal density function. The bandwidth is chosen as
$h = 0.5 n^{-1/3}$ ($n = 200, 3200$).

For $(X_{ij})_{50 \times 200}$, the kernel density estimator is

$f_n(x) = \frac{1}{50 \times 200^{-2/5}} \sum_{i=1}^{50} K((x - \mu_i)/200^{-2/5}),$

where $\mu_i$, $i = 1, \ldots, 50$, are eigenvalues of
$200^{-1} (X_{ij})_{50 \times 200} (X_{ij})_{50 \times 200}^T$. This
curve is drawn by dot-dash lines in the first two pictures.
Fig. 1. Spectral density curves for sample covariance matrices
$n^{-1} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T$, $X_{ij} \sim$
exponential distribution. Curves: MP law, $50 \times 200$,
$800 \times 3200$; horizontal axis $x$.



For $(X_{ij})_{800 \times 3200}$, the kernel density estimator is

$f_n(x) = \frac{1}{800 \times 3200^{-2/5}} \sum_{i=1}^{800} K((x - \mu_i)/3200^{-2/5}),$

where $\mu_i$, $i = 1, \ldots, 800$, are eigenvalues of
$3200^{-1} (X_{ij})_{800 \times 3200} (X_{ij})_{800 \times 3200}^T$.
This curve is drawn by dashed lines in the first two pictures.

The density function of the Marčenko and Pastur law is drawn by solid
lines in the first two pictures. Here, in Figure 1, the distribution is

(4.1)    $F(x) = e^{-(x+1)}, \qquad x \ge -1.$

In Figure 2, the distribution is

(4.2)    $P(X = -1) = 1/2, \qquad P(X = 1) = 1/2.$

From the two figures, we see that the estimated curves fit the Marčenko
and Pastur law very well. As $n$ becomes large, the estimated curves become
closer to the Marčenko and Pastur law.
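The comparison described above can be reproduced in a few lines. The sketch below is a self-contained illustration, assuming NumPy, the centered exponential population of (4.1), the Gaussian kernel and the bandwidth $h = 0.5 n^{-1/3}$ stated earlier in this section. The matrix size $200 \times 800$, the evaluation grid on $[0.5, 2.0]$ and the function names are choices made here for speed and are not taken from the paper.

```python
import numpy as np

def mp_density(x, c):
    """Marcenko-Pastur density for ratio 0 < c <= 1."""
    a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    dens = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    dens[inside] = np.sqrt((b - xi) * (xi - a)) / (2 * np.pi * c * xi)
    return dens

def kernel_density(x, mu, h):
    """Gaussian-kernel spectral density estimate at the points x."""
    u = (x[:, None] - mu[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (np.sqrt(2 * np.pi) * mu.size * h)

def simulate(p, n, seed=0):
    """Absolute error of the kernel estimate against the MP law on a grid."""
    rng = np.random.default_rng(seed)
    X = rng.exponential(1.0, size=(p, n)) - 1.0   # mean 0, variance 1
    mu = np.linalg.eigvalsh(X @ X.T / n)          # eigenvalues of X X^T / n
    h = 0.5 * n ** (-1.0 / 3.0)                   # bandwidth from the text
    grid = np.linspace(0.5, 2.0, 151)
    err = np.abs(kernel_density(grid, mu, h) - mp_density(grid, p / n))
    return grid, err
```

Calling `simulate(200, 800)` returns the grid and the pointwise absolute error, which can then be plotted or averaged.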
Fig. 2. Spectral density curves for sample covariance matrices
$n^{-1} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T$, $X_{ij} \sim$
binomial distribution. Curves: MP law, $50 \times 200$,
$800 \times 3200$; horizontal axis $x$.



Finally, we consider the estimated density curves based on the following
three matrices:

$A_{200} = \frac{1}{200} T_{200}^{1/2} X_{50 \times 200} X_{50 \times 200}^T T_{200}^{1/2},$

$A_{3200} = \frac{1}{3200} T_{3200}^{1/2} X_{800 \times 3200} X_{800 \times 3200}^T T_{3200}^{1/2},$

$A_{6400} = \frac{1}{6400} T_{6400}^{1/2} X_{1600 \times 6400} X_{1600 \times 6400}^T T_{6400}^{1/2},$

where $X_{p \times 4p}$, $p = 50, 800, 1600$, are $p \times 4p$ matrices
whose elements are i.i.d. random variables with distribution (4.1), and
$T_n = \frac{1}{4p} Y_{p \times 4p} Y_{p \times 4p}^T$. Here,
$Y_{p \times 4p}$ is a $p \times 4p$ matrix consisting of i.i.d. random
variables whose distributions are given by (4.2). $T_n$ and
$X_{p \times 4p}$ are independent. The kernel function is the same as before.
The bandwidths corresponding to the three matrices are
$0.5 \times (4p)^{-1/3}$. In Figure 3, we present three estimated curves.
The dot-dash line is based on $A_{200}$, the dashed line on $A_{3200}$ and
the solid line on $A_{6400}$. Although, in this case, we do not know its
exact formula, we can predict the limiting spectral density function from
Figure 3.
Fig. 3. Spectral density curves for sample covariance matrices
$n^{-1} T_n^{1/2} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T T_n^{1/2}$,
$X_{ij} \sim$ exponential distribution,
$T_n = n^{-1} (Y_{ij})_{p \times n} (Y_{ij})_{p \times n}^T$, $Y_{ij} \sim$
binomial distribution. Curves: $50 \times 100$, $800 \times 1600$,
$1600 \times 3200$.



In order to show that the above conclusion is reliable, we choose ten
points throughout the range and calculate the mean square errors (MSEs)
for the kernel density estimator at the selected ten points, based on 500
matrices,

$\mathrm{MSE}(x) = 500^{-1} \sum_{i=1}^{500} (f_n^{(i)}(x) - f_c(x))^2,$

where $f_n^{(i)}(x)$ is the kernel density estimator at $x$ based on the
$i$th matrix. If the limiting distribution is unknown, as in the case of
$A_{200}$, we use the averaged spectral density

$\bar{f}_c(x) = 500^{-1} \sum_{i=1}^{500} f_n^{(i)}(x).$
Table 1

MSE of spectral density curves for sample covariance matrices
$n^{-1} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T$, $X_{ij} \sim$ exponential distribution

x =           0.30       0.511      0.722      0.933      1.144
50 x 200      9.89e-2    3.21e-2    3.18e-2    3.25e-2    3.56e-2
800 x 3200    3.84e-03   7.44e-5    7.28e-5    7.67e-5    7.34e-5

x =           1.356      1.567      1.778      1.989      2.20
50 x 200      3.79e-2    3.18e-2    3.73e-2    2.76e-2    3.63e-2
800 x 3200    7.67e-5    7.23e-5    6.88e-5    6.60e-5    6.74e-5


So, in this case,

$\mathrm{MSE}(x) = 500^{-1} \sum_{i=1}^{500} (f_n^{(i)}(x) - \bar{f}_c(x))^2.$
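Both versions of the MSE, against a known limiting density or against the averaged spectral density, reduce to the same array operation. The following minimal sketch assumes NumPy and that the replicated estimates have already been stacked into an array with one row per simulated matrix; the function name is illustrative.

```python
import numpy as np

def pointwise_mse(estimates, reference=None):
    """Pointwise MSE over replications.

    estimates: array of shape (replications, points), row i holding the
               kernel estimate from the i-th simulated matrix.
    reference: known limiting density at the same points; when omitted,
               the average over replications is used instead.
    """
    est = np.asarray(estimates, dtype=float)
    if reference is None:
        reference = est.mean(axis=0)          # averaged spectral density
    ref = np.asarray(reference, dtype=float)
    return ((est - ref) ** 2).mean(axis=0)
```

The first argument would typically have 500 rows and ten columns in the setting described above.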

The numerical results for the three different matrices considered in this
section are presented in Tables 1, 2 and 3. The notation "e-j" in these
tables means multiplication by $10^{-j}$. The MSEs are uniformly small. As
$n$ becomes large, the MSEs become smaller. This supports the conclusion that
our proposed kernel spectral density curve is consistent.

We also conducted simulations using a wide range of bandwidths from
small $h = n^{-1/2}$ to large $h = n^{-1/10}$. The kernel spectral density
curves seem to change rather slowly. This indicates that the kernel spectral
density estimator is robust with respect to the bandwidth selection.

5. Proofs of Theorems 1 and 2. Throughout this section and the next,
to simplify notation, $M, M_1, \ldots, M_{12}$ stand for constants which may
take different values from one appearance to the next.



Table 2

MSE of spectral density curves for sample covariance matrices
$n^{-1} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T$, $X_{ij} \sim$ binomial distribution

x =           0.30       0.511      0.722      0.933      1.144
50 x 200      3.23e-1    3.14e-2    2.38e-2    2.76e-2    2.86e-2
800 x 3200    5.13e-03   8.01e-5    6.05e-5    7.30e-5    6.53e-5

x =           1.356      1.567      1.778      1.989      2.20
50 x 200      2.70e-2    2.44e-2    2.42e-2    2.40e-2    1.69e-2
800 x 3200    6.28e-5    7.65e-5    6.14e-5    6.68e-5    1.13e-4

Table 3

MSE of spectral density curves for sample covariance matrices
$n^{-1} T_n^{1/2} (X_{ij})_{p \times n} (X_{ij})_{p \times n}^T T_n^{1/2}$, $X_{ij} \sim$ exponential distribution,
$T_n = n^{-1} (Y_{ij})_{p \times n} (Y_{ij})_{p \times n}^T$, $Y_{ij} \sim$ binomial distribution

x =            0.30       0.511      0.722      0.933      1.144
50 x 100       1.20e-2    8.71e-3    8.58e-3    7.90e-3    8.77e-3
800 x 1600     6.25e-05   4.00e-5    3.51e-5    3.19e-5    2.71e-5
1600 x 3200    2.98e-5    1.83e-5    1.44e-5    1.39e-5    1.53e-5

x =            1.356      1.567      1.778      1.989      2.20
50 x 200       7.91e-3    8.07e-3    8.34e-3    7.54e-3    7.17e-3
800 x 3200     3.04e-5    3.10e-5    2.98e-5    2.89e-5    2.66e-5
1600 x 3200    1.19e-5    1.19e-5    1.36e-5    1.29e-5    1.32e-5


5.1. Proof of Theorem 1. We begin by developing the following two
lemmas, necessary for the argument of Theorem 1.

Lemma 1. Under the assumptions of Theorem 1, let $F_{c_n,H_n}(t)$ be the
distribution function obtained from $F_{c,H}(t)$ by replacing $c$ and $H$ by
$c_n$ and $H_n$, respectively. Furthermore, $f_{c_n,H_n}(x)$ denotes the
density of $F_{c_n,H_n}(x)$. Then,

$\sup_{n,x} f_{c_n,H_n}(x) \le M.$

Proof. From (3.10) in [2], we have

(5.1)    $z = -\frac{1}{\underline{m}_n^0} + c_n \int \frac{t\,dH_n(t)}{1 + t\underline{m}_n^0},$

where $\underline{m}_n^0 = \underline{m}_n^0(z) = m_{\underline{F}_{c_n,H_n}}(z)$.
Based on this expression, conclusions similar to those in Theorem 1.1 of [14]
still hold if we replace $F_{c,H}(x)$ by $F_{c_n,H_n}(x)$ and then argue
similarly with the help of [14]. For example, the equality (1.6) in
Theorem 1.1 of [14] states that

(5.2)    $x = -\frac{1}{\underline{m}(x)} + c \int \frac{t\,dH(t)}{1 + t\underline{m}(x)}.$

Similarly, for every $x \ne 0$ for which $f_{c_n,H_n}(x) > 0$,
$\pi \underline{f}_{c_n,H_n}(x)$ (with $\underline{f}_{c_n,H_n}$ the density
of $\underline{F}_{c_n,H_n}$) is the imaginary part of the unique
$\underline{m}_n^0(x)$ satisfying

(5.3)    $x = -\frac{1}{\underline{m}_n^0(x)} + c_n \int \frac{t\,dH_n(t)}{1 + t\underline{m}_n^0(x)}.$
Now, consider the imaginary part of $\underline{m}_n^0(x)$. From (5.3), we
obtain

(5.4)    $c_n \int \frac{t^2\,dH_n(t)}{|1 + t\underline{m}_n^0(x)|^2} = \frac{1}{|\underline{m}_n^0(x)|^2}.$

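It follows from (5.3), (5.4) and Hölder's inequality that

$|\underline{m}_n^0(x)| \le \frac{|c_n - 1|}{x} + \frac{c_n}{x} \int \frac{dH_n(t)}{|1 + t\underline{m}_n^0(x)|}$

$\le \frac{|c_n - 1|}{x} + \frac{c_n}{x} \left(\int \frac{t^2\,dH_n(t)}{|1 + t\underline{m}_n^0(x)|^2}\right)^{1/2} \left(\int \frac{dH_n(t)}{t^2}\right)^{1/2}$

$\le \frac{|c_n - 1|}{x} + \frac{\sqrt{c_n}}{x\,|\underline{m}_n^0(x)|} \left(\int \frac{dH_n(t)}{t^2}\right)^{1/2},$

where $\int \frac{dH_n(t)}{t^2}$ is well defined because we require the
support of $F_{c,H}(x)$ to be $[a, b]$ with $a > 0$. This inequality is
equivalent to

$|\underline{m}_n^0(x)|^2 \le \frac{|c_n - 1|}{x} |\underline{m}_n^0(x)| + \frac{\sqrt{c_n}}{x} \left(\int \frac{dH_n(t)}{t^2}\right)^{1/2}.$

It follows that

(5.5)    $\sup_{n,x} |\underline{m}_n^0(x)| \le M.$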
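The step from the quadratic inequality to (5.5) can be made explicit. The worked step below writes $y = |\underline{m}_n^0(x)|$, with $\alpha$ and $\beta$ as shorthand introduced here for the two coefficients, and assumes, as the argument implicitly does, that $\alpha$ and $\beta$ are bounded uniformly in $n$ and $x$.

```latex
\[
y^2 \le \alpha y + \beta,
\qquad
\alpha = \frac{|c_n - 1|}{x},
\quad
\beta = \frac{\sqrt{c_n}}{x}\Bigl(\int \frac{dH_n(t)}{t^2}\Bigr)^{1/2},
\]
\[
\Longrightarrow\quad
\Bigl(y - \frac{\alpha}{2}\Bigr)^2 \le \frac{\alpha^2}{4} + \beta
\quad\Longrightarrow\quad
y \le \frac{\alpha + \sqrt{\alpha^2 + 4\beta}}{2}.
\]
```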

This leads to $\sup_{n,x} f_{c_n,H_n}(x) \le M$. □

Lemma 2. Under the assumptions of Lemma 1, when $x_n \in [a, b]$ and
$x_n \to x$, we have

(5.6)    $f_{c_n,H_n}(x_n) - f_{c,H}(x_n) \to 0.$

Proof. Obviously, $f_{c,H}(x_n) - f_{c,H}(x) \to 0$ because $f_{c,H}(x)$ is
continuous on the interval $[a, b]$. Moreover, in view of (5.5), we may
choose a subsequence $n_k$ so that $\underline{m}_{n_k}^0(x_{n_k})$
converges. We denote its limit by $a(x)$. Suppose that $\Im(a(x)) > 0$. Then,
as in Lemma 3.3 in [14], we may argue that the limit of
$\underline{m}_n^0(x_n)$ exists as $n \to \infty$. Next, we verify that
$a(x) = \underline{m}(x)$. By (5.3), we then have

$x = -\frac{1}{a(x)} + c \int \frac{t\,dH(t)}{1 + t a(x)}$

because, via (5.4) and Hölder's inequality,

$\left|\int \frac{t\,dH_n(t)}{1 + t\underline{m}_n^0(x_n)} - \int \frac{t\,dH_n(t)}{1 + t a(x)}\right| \le |\underline{m}_n^0(x_n) - a(x)| \left(\frac{1}{c_n |\underline{m}_n^0(x_n)|^2} \int \frac{t^2\,dH_n(t)}{|1 + t a(x)|^2}\right)^{1/2} \to 0$

and

$\int \frac{t\,dH_n(t)}{1 + t a(x)} \to \int \frac{t\,dH(t)}{1 + t a(x)}.$

Since the solution satisfying the equation (5.2) is unique,
$a(x) = \underline{m}(x)$. Therefore,
$\underline{m}_n^0(x_n) \to \underline{m}(x)$, which then implies that

(5.7)    $f_{c_n,H_n}(x_n) - f_{c,H}(x) \to 0.$

Now, suppose that $\Im(a(x)) = 0$. This implies that
$\Im(\underline{m}_n^0(x_n)) \to 0$ and then that $f_{c_n,H_n}(x_n) \to 0$,
because if there is another subsequence on which
$\Im(\underline{m}_n^0(x_n))$ converges to a positive number, then
$\underline{m}_n^0(x_n)$ must converge to the complex number with the
positive imaginary part, by the previous argument. Next, by (1.2) and (5.1),
$\Im(\underline{m}_n^0(x_n + iv)) - \Im(\underline{m}(x_n + iv)) \to 0$ for
any $v > 0$. We may then choose $v_n \to 0$ so that
$\Im(\underline{m}_n^0(x_n + iv_n)) - \Im(\underline{m}(x_n + iv_n)) \to 0$
as $n \to \infty$. Moreover,
$\Im(\underline{m}(x_n + iv_n)) \to \Im(\underline{m}(x))$ and
$\Im(\underline{m}_n^0(x_n + iv_n)) - \Im(\underline{m}_n^0(x_n)) \to 0$ by
Theorem 1.1 of [14] and a theorem for $\underline{m}_n^0(z)$ similar to
Theorem 1.1 of [14]. Therefore, in view of the continuity of
$\underline{m}_n^0(x)$ for $x \ne 0$, $\Im(\underline{m}(x)) = 0$ and then
(5.6) holds for the case $\Im(a(x)) = 0$. □



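We now proceed to prove Theorem 1. First, we claim that

(5.8)    $\sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF^{A_n}(t) - \frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c_n,H_n}(t)\right| \to 0$

in probability. Indeed, from integration by parts and Theorem 3, we obtain

$E \sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF^{A_n}(t) - \frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c_n,H_n}(t)\right|$

$= E \sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) d(F^{A_n}(t) - F_{c_n,H_n}(t))\right|$

$= E \sup_x \left|\frac{1}{h} \int K'(u) (F^{A_n}(x - uh) - F_{c_n,H_n}(x - uh))\,du\right|$

$\le \frac{1}{h} E \sup_x |F^{A_n}(x) - F_{c_n,H_n}(x)| \int |K'(u)|\,du$

$\le \frac{M}{n^{2/5} h} \to 0.$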
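The final bound vanishes precisely because of the bandwidth condition (2.5). The following worked step makes the link explicit.

```latex
\[
n^{2/5} h = \bigl(n h^{5/2}\bigr)^{2/5} \longrightarrow \infty
\quad \text{by (2.5)},
\qquad \text{so} \qquad
\frac{M}{n^{2/5} h} \longrightarrow 0 .
\]
```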

The next aim is to show that

$\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c_n,H_n}(t) - \frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c,H}(t) \to 0$

uniformly in $x \in [a, b]$. This is equivalent to, for any sequence
$\{x_n, n \ge 1\}$ in $[a, b]$ converging to $x$,

(5.9)    $\int K(u) (f_{c_n,H_n}(x_n - uh) - f_{c,H}(x_n - uh))\,du \to 0.$

From Theorem 1.1 of [14], $f_{c,H}(x)$ is uniformly bounded on the interval
$[a, b]$. Therefore, (5.9) follows from the dominated convergence theorem,
Lemma 1 and Lemma 2.
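Finally,

$\sup_{x \in [a,b]} \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c,H}(t) - f_{c,H}(x) \int_{x-b}^{x-a} \frac{1}{h} K\left(\frac{t}{h}\right) dt\right|$

$\le \sup_{x \in [a,b]} \left|\int_{|t| > \delta} (f_{c,H}(x - t) - f_{c,H}(x)) \frac{1}{h} K\left(\frac{t}{h}\right) dt\right| + \sup_{x \in [a,b]} \left|\int_{|t| \le \delta} (f_{c,H}(x - t) - f_{c,H}(x)) \frac{1}{h} K\left(\frac{t}{h}\right) dt\right|$

$\le 2 \sup_{x \in [a,b]} f_{c,H}(x) \int_{|y| > \delta/h} |K(y)|\,dy + \sup_{x \in [a,b]} \sup_{|t| \le \delta} |f_{c,H}(x - t) - f_{c,H}(x)| \int \frac{1}{h} \left|K\left(\frac{t}{h}\right)\right| dt,$

which goes to zero by fixing $\delta$ and letting $n \to \infty$ first, and
then letting $\delta \to 0$. On the other hand, obviously,

$\int_{x-b}^{x-a} \frac{1}{h} K\left(\frac{t}{h}\right) dt = \int_{(x-b)/h}^{(x-a)/h} K(t)\,dt \to \int_{-\infty}^{+\infty} K(t)\,dt = 1.$

Thus, the proof is complete.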



5.2. Proof of Theorem 2. Denote by $F_{c_n}(t)$ the distribution function
obtained from $F_c(t) = \int_{-\infty}^{t} f_c(x)\,dx$ with $c$ replaced by
$c_n$. Let $S_n = \frac{1}{n} X_n X_n^T$. From integration by parts, we
obtain

$\sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) d(F^{S_n}(t) - F_{c_n}(t))\right|$

$= \sup_x \left|\frac{1}{h} \int K'(u) (F^{S_n}(x - uh) - F_{c_n}(x - uh))\,du\right|$

$\le \frac{1}{h} \sup_x |F^{S_n}(x) - F_{c_n}(x)| \int |K'(u)|\,du = O_p\left(\frac{1}{\sqrt{n}\,h}\right),$

where the last step uses Theorem 1.2 in [6]. We next prove that



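$\sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_{c_n}(t) - \frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_c(t)\right| \to 0.$

It suffices to prove that

(5.10)    $\sup_x |f_{c_n}(x) - f_c(x)| \to 0,$

where $f_{c_n}(x)$ stands for the density of $F_{c_n}(x)$.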
Note that when $c < 1$,

$f_{c_n}(x) - f_c(x) = \frac{\sqrt{(x - a(c_n))(b(c_n) - x)}}{2\pi c_n x} - \frac{\sqrt{(x - a(c))(b(c) - x)}}{2\pi c x},$

where

$a(c) = (1 - \sqrt{c})^2, \qquad b(c) = (1 + \sqrt{c})^2,$

and $a(c_n)$ and $b(c_n)$ are obtained from $a(c)$ and $b(c)$ by replacing
$c$ with $c_n$, respectively. It is then a simple matter to verify that
(5.10) holds for $x \in [a(c), b(c)]$.

Finally, as in Theorem 1, one may prove that

$\sup_x \left|\frac{1}{h} \int K\left(\frac{x - t}{h}\right) dF_c(t) - f_c(x)\right| \to 0.$

Thus, the proof is complete.

5.3. Proof of Corollary 1. The result follows from Theorem 1 in [11].

6. Proof of Theorem 3.



6.1. Summary of argument. The strategy is to use Corollary 2.2 and
Lemma 7.1 in [6]. To this end, a key step is to establish an upper bound
for $|b_1|$, defined below. Note that in a suitable interval for $z$ with a
well-chosen imaginary part $v$, the absolute value of the expectation of the
Stieltjes transform of $F^{B_n}$, $|E\underline{m}_n(z)|$, is bounded.
Moreover, for such $v$, when $n \to \infty$, the difference between $b_1$ and
its alternative expression involving $E\underline{m}_n(z)$, $\rho_n$ [given
in (6.13)], converges to zero with some convergence rate. Therefore, we may
argue that $|b_1|$ is bounded. Once this is done, we further develop a
convergence rate of $m_n(z) - Em_n(z)$ using a martingale decomposition, and
a convergence rate of the difference between $Em_n(z)$ and its corresponding
limit using a recurrence approach.

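We begin by giving some notation. Define $A(z) = A_n - zI$,
$A_j(z) = A(z) - s_j s_j^T$ and $s_j = n^{-1/2} T_n^{1/2} x_j$, with $x_j$
being the $j$th column of $X_n$. Let $E_j = E(\cdot \mid s_1, \ldots, s_j)$
and let $E_0$ denote the expectation. Moreover, introduce

$\beta_j = \frac{1}{1 + s_j^T A_j^{-1}(z) s_j}, \qquad \bar{\beta}_j = \frac{1}{1 + n^{-1}\,\mathrm{tr}\,T_n A_j^{-1}(z)},$

$\eta_j = s_j^T A_j^{-1}(z) s_j - \frac{1}{n}\,\mathrm{tr}\,A_j^{-1}(z) T_n, \qquad b_1 = \frac{1}{1 + n^{-1} E\,\mathrm{tr}\,T_n A_1^{-1}(z)},$

$m_n(z) = \int \frac{dF^{A_n}(x)}{x - z}, \qquad m_n^0(z) = \int \frac{dF_{c_n,H_n}(x)}{x - z},$

$\underline{m}_n(z) = \int \frac{dF^{B_n}(x)}{x - z}, \qquad \underline{m}_n^0(z) = \int \frac{d\underline{F}_{c_n,H_n}(x)}{x - z}$

and

$\xi_1 = s_1^T A_1^{-1}(z) s_1 - \frac{1}{n} E\,\mathrm{tr}\,A_1^{-1}(z) T_n.$

Here, $\underline{F}_{c_n,H_n}(x)$ is obtained from $\underline{F}_{c,H}(x)$
by replacing $c$ and $H$ by $c_n$ and $H_n$, respectively.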

Let $\Delta_n = \sup_x |EF^{A_n}(x) - F_{c_n,H_n}(x)|$ and
$v_0 = \max\{\gamma \Delta_n, M_1 n^{-2/5}\}$ with $0 < \gamma < 1$ to be
chosen later and $M_1$ an appropriate constant. As in Lemma 3.1 and
Lemma 3.2 in [14], we obtain, for $u \in [a, b]$ and $v_0 \le v \le 1$,

(6.1)    $|m_n^0(z)| \le M, \qquad |\underline{m}_n^0(z)| \le M,$

where the bound for $|m_n^0(z)|$ is obtained with the help of (1.5). Using
integration by parts, we have, for $v > v_0$,

$|Em_n(z) - m_n^0(z)| = \left|\int_{-\infty}^{+\infty} \frac{1}{x - z}\,d(EF^{A_n}(x) - F_{c_n,H_n}(x))\right| = \left|\int_{-\infty}^{+\infty} \frac{EF^{A_n}(x) - F_{c_n,H_n}(x)}{(x - z)^2}\,dx\right| \le \frac{\pi \Delta_n}{v} \le \frac{\pi}{\gamma}.$
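The factor $\pi / v$ in the last display comes from an elementary integral. The worked step below records it, writing $z = u + iv$ with $v > 0$ and using $v \ge v_0 \ge \gamma \Delta_n$ for the final inequality.

```latex
\[
\int_{-\infty}^{+\infty} \frac{dx}{|x - z|^2}
  = \int_{-\infty}^{+\infty} \frac{dx}{(x - u)^2 + v^2}
  = \frac{1}{v}\Bigl[\arctan\frac{x - u}{v}\Bigr]_{-\infty}^{+\infty}
  = \frac{\pi}{v},
\]
\[
\text{so}\qquad
\Bigl|\int \frac{EF^{A_n}(x) - F_{c_n,H_n}(x)}{(x - z)^2}\,dx\Bigr|
  \le \Delta_n \cdot \frac{\pi}{v}
  \le \frac{\pi \Delta_n}{\gamma \Delta_n}
  = \frac{\pi}{\gamma}.
\]
```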

This implies that

(6.2)    $|Em_n(z)| \le M, \qquad |E\underline{m}_n(z)| \le M,$

where the bound for $|E\underline{m}_n(z)|$ is obtained from an equality
similar to (1.5), noting that $\Re z \ge a$. It is readily observed that
$|\beta_j|$ and $|\bar{\beta}_j|$ are both bounded by $|z|/v$ (see (3.4)
in [2]) and that Lemma 2.10 in [2] yields

(6.3)    $|\beta_j s_j^T A_j^{-2}(z) s_j| \le v^{-1},$

which gives

(6.4)    $|\mathrm{tr}(A - zI)^{-1} - \mathrm{tr}(A_k - zI)^{-1}| \le v^{-1}.$

This, together with (6.2), gives, for $v > v_0$,

(6.5)    $\left|\frac{1}{n} E\,\mathrm{tr}\,A_1^{-1}(z)\right| \le M.$

In the subsequent subsections, we will assume that $z = u + iv$ with
$v > v_0$ and $u \in [a, b]$.

6.2. Bounds for $n^{-2} E|\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z)|^2$ and $E|\beta_1|^2$.

Lemma 3. If $|b_1| \le M$, then, for $v > M_1 n^{-2/5}$,

(6.6)    $\frac{1}{n^2} E|\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z)|^2 \le \frac{M}{n^2 v^3}.$

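Proof. We have

$\frac{1}{n} (\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z))$

$= \frac{1}{n} \sum_{j=1}^{n} (E_j\,\mathrm{tr}\,A^{-1}(z) - E_{j-1}\,\mathrm{tr}\,A^{-1}(z))$

$= \frac{1}{n} \sum_{j=1}^{n} [E_j (\mathrm{tr}\,A^{-1}(z) - \mathrm{tr}\,A_j^{-1}(z)) - E_{j-1} (\mathrm{tr}\,A^{-1}(z) - \mathrm{tr}\,A_j^{-1}(z))]$

$= \frac{1}{n} \sum_{j=1}^{n} (E_j - E_{j-1}) \left[-b_1 \left(s_j^T A_j^{-2}(z) s_j - \frac{1}{n}\,\mathrm{tr}\,A_j^{-2}(z) T_n\right) + b_1 \beta_j s_j^T A_j^{-2}(z) s_j \xi_j\right],$

where the last step uses
$\mathrm{tr}\,A^{-1}(z) - \mathrm{tr}\,A_j^{-1}(z) = -\beta_j s_j^T A_j^{-2}(z) s_j$
and the fact that

(6.7)    $\beta_j = b_1 - b_1 \beta_j \xi_j.$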
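Identity (6.7) follows directly from the definitions of $\beta_j$ and $b_1$. The worked step below assumes that $\xi_j$ is defined like $\xi_1$ with the index 1 replaced by $j$, and that $E\,\mathrm{tr}\,T_n A_j^{-1}(z)$ does not depend on $j$ because the columns of $X_n$ are identically distributed.

```latex
\begin{align*}
\beta_j - b_1
  &= \beta_j b_1 \Bigl(\frac{1}{b_1} - \frac{1}{\beta_j}\Bigr) \\
  &= \beta_j b_1 \Bigl(\frac{1}{n} E\,\mathrm{tr}\, T_n A_j^{-1}(z)
       - s_j^T A_j^{-1}(z) s_j\Bigr) \\
  &= -\, b_1 \beta_j \xi_j .
\end{align*}
```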

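Lemma 2.7 in [2] then gives

$E\left|\frac{1}{n} \sum_{j=1}^{n} (E_j - E_{j-1})\, b_1 \left(s_j^T A_j^{-2}(z) s_j - \frac{1}{n}\,\mathrm{tr}\,A_j^{-2}(z) T_n\right)\right|^2$

$\le \frac{M}{n^3} \sum_{j=1}^{n} E\,\frac{1}{n}\,\mathrm{tr}\,A_j^{-2}(z) T_n A_j^{-2}(\bar{z}) T_n$

$\le \frac{M}{n^2 v^2}\,\frac{1}{n} E\,\mathrm{tr}\,A_1^{-1}(z) A_1^{-1}(\bar{z}) \le \frac{M}{n^2 v^3}$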
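because, via (6.5),

(6.8)    $\frac{1}{n} E\,\mathrm{tr}\,A_1^{-1}(z) A_1^{-1}(\bar{z}) = \frac{1}{v}\,\Im\left(\frac{1}{n} E\,\mathrm{tr}\,A_1^{-1}(z)\right) \le \frac{M}{v}.$

Using (6.3) and Lemma 2.7 in [2], we similarly have

$E\left|\frac{1}{n} \sum_{j=1}^{n} (E_j - E_{j-1})\, b_1 \beta_j s_j^T A_j^{-2}(z) s_j \xi_j\right|^2 \le \frac{M}{n^2 v^3} + \frac{M}{n^3 v^2} E|\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z)|^2.$

Summarizing the above, we have proven that

$\left(1 - \frac{M}{n v^2}\right) \frac{1}{n^2} E|\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z)|^2 \le \frac{M}{n^2 v^3},$

which implies Lemma 3 by choosing an appropriate $M_1$ such that
$\frac{M}{n v^2} < \frac{1}{2}$. □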

Lemma 4. If $|b_1| \le M$, then, for $v > M_1 n^{-2/5}$,

(6.9)    $\frac{1}{n^4} E|\mathrm{tr}\,A^{-1}(z) - E\,\mathrm{tr}\,A^{-1}(z)|^4 \le \frac{M}{n^4 v^6}.$

Proof. Lemma 4 is obtained by repeating the argument of Lemma 3 
and applying 

El -trA^(z)T n AY 2 (z)T r / 



n 



M 

D 

Lemma 5. If $|b_1| \le M$, then there is some constant $M_2$ such that for
$v > M_2 n^{-2/5}$,

$E|\beta_1|^2 \le M.$

Proof. By (6.7), we have 
and 

(6.10)    $E|\xi_1(z)|^4 \le M E|\eta_1(z)|^4 + M n^{-4} E|\mathrm{tr}\,A_1^{-1}(z) T_n - E\,\mathrm{tr}\,A_1^{-1}(z) T_n|^4 \le \frac{M}{n^2 v^2} + \frac{M}{n^4 v^6}$

because repeating the argument of Lemma 3 and Lemma 4 yields

(6.11)    $E\left|\frac{1}{n}\,\mathrm{tr}\,D A_1^{-1}(z) - E\,\frac{1}{n}\,\mathrm{tr}\,D A_1^{-1}(z)\right|^4 \le \frac{M \|D\|^4}{n^4 v^6}$

for a fixed matrix $D$. It follows that

£|/3i| 2 < l&il 2 + M'EU" + %l!(^| /3l |^| a |4 ) i/2 ) 

which gives 



nv nv 

Solving this inequality gives Lemma 5. □ 
6.3. A bound for $b_1(z)$. By (6.7) and

(6.12)    $1 - c_n - z c_n m_n(z) = \frac{1}{n} \sum_{j=1}^{n} \beta_j$

(see the equality above (2.2) in [13]), we get

(6.13)    $b_1 = 1 - c_n - z c_n E m_n(z) + \rho_n,$

where

$\rho_n = b_1 E(\beta_1 \xi_1).$
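The passage from (6.12) to (6.13) is short. The worked step below takes expectations in (6.12), uses that the $\beta_j$ are identically distributed (an assumption that follows from the i.i.d. columns of $X_n$), and then applies (6.7) with $j = 1$.

```latex
\begin{align*}
1 - c_n - z c_n E m_n(z)
  &= \frac{1}{n} \sum_{j=1}^{n} E\beta_j = E\beta_1 \\
  &= b_1 - b_1 E(\beta_1 \xi_1) = b_1 - \rho_n ,
\end{align*}
\[
\text{hence}\qquad b_1 = 1 - c_n - z c_n E m_n(z) + \rho_n .
\]
```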

Lemma 6. If $|b_1| \le M$, then there is some constant $M_3$ such that for
$v > M_3 n^{-2/5}$,

$|\rho_n| \le \frac{M}{nv}.$

Proof. Lemma 5 and (6.10) ensure that

$|E[\beta_1(z) \xi_1(z)]| = |b_1(z) E[\beta_1(z) \xi_1^2(z)]| \le M (E|\beta_1(z)|^2 E|\xi_1(z)|^4)^{1/2} \le \frac{M}{nv}.$

Thus, Lemma 6 is proved. □



Lemma 7. If $\Im(z + \rho_n) > 0$, then there exists a positive constant
$M$ depending on $\gamma$, $a$, $b$ such that

$|b_1| \le M.$

Proof. Consider the case $s(Em n (z)) >v>0 first. It follows from (6.13) 
and the assumption that 

3=(c n + z + zc n Em n (z) - 1) > 

= -\b 1 \ 2 %(l + n- 1 EtrA- 1 (z)). 
Note that

(6.14)    $\Im(c_n + z + z c_n E m_n(z) - 1) = v + v c_n \int \frac{x}{|x - z|^2}\,dEF^{A_n}(x) = v + c_n [v \Re(E m_n(z)) + u \Im(E m_n(z))] > 0.$

Thus, we have 

2 v + c n [v^(£'m n (z)) + u3(.Em n (;2))] 
oi S 



(6.15) < 



Ss(Em n (z)) 
[1 + c n \^(Em n (z))\ + c n u]Q(Em n (z)) 



CrSs{Em n (z)) 
< l/Cn +M + b. 
Next, consider the case Q(Em n (z)) < v. Note that for u S [a, b], 
,16) MEm n (z))\ > 



M + v l 

This, together with (6.15), gives 

2 (M + ^)[l + c n (|K(£Jm n (^)| +«)]« l + cJK^m^l+w] 
1 1 ~ c n Mv c n M u 

Lemma 8. There is some constant $M_4$ such that, for any
$v > M_4 n^{-2/5}$,

$\Im(z + \rho_n) > 0.$

Proof. First, we claim that

(6.17)    $\Im(z + \rho_n) \ne 0.$

If not, $\Im(z + \rho_n) = 0$ implies that

(6.18)    $|\rho_n| \ge |\Im(\rho_n)| = v.$

On the other hand, if $\Im(z + \rho_n) = 0$, then we conclude from Lemma 7
and Lemma 6 that

$|\rho_n| \le \frac{M}{nv}.$

Thus, recalling that $v > M_4 n^{-2/5}$, we may choose an appropriate
constant $M_4$ so that

$|\rho_n| \le \frac{v}{3},$

which contradicts (6.18). Therefore, (6.17) holds.
Next, note that

$\Im(z + z n^{-1} E\,\mathrm{tr}\,A_1^{-1}(z)) \ge v, \qquad \Im(z + z n^{-1} E\,\mathrm{tr}\,A^{-1}(z)) \ge v.$

Therefore, when taking $v = 1$,

$|b_1(z)| \le \frac{|z|}{v} \le M, \qquad |b(z)| \le \frac{|z|}{v} \le M.$

It follows from Lemma 7 and Lemma 6 that

$|\rho_n| \le \frac{M}{n},$

which implies that for $n$ large and $v = 1$,

(6.19)    $\Im(z + \rho_n) > 0.$

This, together with (6.17) and continuity of the function, ensures that
(6.19) holds for $1 \ge v > M_4 n^{-2/5}$. Thus, the proof of Lemma 8 is
complete. □



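6.4. Convergence of expected value. Based on Lemma 7 and Lemma 8,
$|b_1| \le M$ and therefore all results in Section 6.2 remain true for
$v > M n^{-2/5}$ with some appropriate positive constant $M$.

Set $F^{-1}(z) = (E\underline{m}_n T_n + I)^{-1}$ and then write (see (5.2)
in [2])

(6.20)    $c_n \int \frac{dH_n(t)}{1 + t E\underline{m}_n} + z c_n E(m_n(z)) = D_n,$

where

$D_n = E\beta_1 \left[s_1^T A_1^{-1}(z) F^{-1}(z) s_1 - \frac{1}{n} E(\mathrm{tr}\,F^{-1}(z) T_n A^{-1}(z))\right].$

It follows that (see (3.20) in [2])

(6.21)    $E\underline{m}_n(z) - \underline{m}_n^0(z) = \underline{m}_n^0(z)\,E\underline{m}_n\,\omega_n \left(1 - c_n E\underline{m}_n\,\underline{m}_n^0 \int \frac{t^2\,dH_n(t)}{(1 + t E\underline{m}_n)(1 + t\underline{m}_n^0)}\right)^{-1},$

where $\omega_n = -D_n / E\underline{m}_n$.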
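Applying (6.7), we obtain

$D_n = b_1 E\left[\frac{1}{n}\,\mathrm{tr}\,F^{-1}(z) T_n A_1^{-1}(z) - \frac{1}{n}\,\mathrm{tr}\,F^{-1}(z) T_n A^{-1}(z)\right] - E\left[b_1 \beta_1 \xi_1 \left(s_1^T A_1^{-1}(z) F^{-1}(z) s_1 - \frac{1}{n} E(\mathrm{tr}\,F^{-1}(z) T_n A^{-1}(z))\right)\right].$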
We now investigate $D_n$. We conclude from (6.3) and Hölder's inequality that



(6.22) 



-tYY~ l (z)T n A7 l (z) - -trF-^VTnA^Cz 

n n 



< 



M (\ 



nifil 2 \ n 



■ixF~ 1 (z)F- 1 (z' 



1/2 



Let $\zeta_1 = s_1^TA_1^{-1}(z)F^{-1}(z)s_1 - \frac{1}{n}\bigl(\operatorname{tr}F^{-1}(z)T_nA_1^{-1}(z)\bigr)$. By (6.8) and Hölder's inequality, we have



\Eb 2 1 CxCl\ = \b 2 1 E Vl Ci\< 



M (\ 



nifll 2 \ n 



■trF" 1 (z)F" 1 (z) 



1/2 



and by Lemma 5, Lemma 4, (6.10), (6.23) and Hölder's inequality, we have



E\blMl(i 



^MWilV^M^ICilY 74 

+ M{E\f3 l \ 2 ) 1 / 2 

-tr A^(z)T n - £-trA7/ 1 (2:)T r 



< 



x [ 

M 



E(\Ci\ 2 \A^(z)) 



1/2 



nv 3/2 ' 

where we also use (6.11) and the fact that, via Lemma 2.11 in [2], 
(6.23) llF^z)!!^. 
These, together with (6.7), give 

|£Mi£iGI < |£&i6Ci| + \EbjMki\ 

M M fl 



(6.24) 



< 



+ 



nv 3/2 nifll 2 \ 
Similarly, by (6.11), we may get 



trF" 1 (z)F~ 1 (z) 



1/2 



^i/Sieif-trF-^^T^^^-^itrF-^^T.Ar 1 ^)') 



(6.25) 

In view of (6.22), we have 



< 



M 



nv 



3/2 ' 



E 



b l Mi(E-tTF- 1 {z)T n A^ 1 {z)-E-tiF~ 1 (z)T n A~ 1 {z) 
\ n n 



<^L(\ t , Y '\z)F-\z) 
nv\n 
Summarizing the above gives

\[ |D_n| \le \frac{M}{nv^{3/2}} + \frac{M}{nv^{3/2}}\Bigl(\frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z)\Bigr)^{1/2}. \tag{6.26} \]

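Now, considering the imaginary part of (6.20), we may conclude that

\[ \Bigl|\int \frac{t\,dH_n(t)}{|1 + tEm_n|^2}\Bigr| \le \frac{|\Im(zc_nE(m_n(z)))|}{\Im(Em_n)} + \frac{|D_n|}{\Im(Em_n)}. \tag{6.27} \]

Formulas (6.16), (6.2) and an equality similar to (1.5) ensure that

\[ \frac{|\Im(zc_nE(m_n(z)))|}{\Im(Em_n)} \le \frac{u\,\Im(Em_n) + v\,|\Re(Em_n)|}{\Im(Em_n)} \le M \tag{6.28} \]

and that

\[ \frac{|D_n|}{\Im(Em_n)} \le \frac{M}{nv^{5/2}} + \frac{M}{nv^{5/2}}\Bigl(\frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z)\Bigr)^{1/2}. \]

It follows that

\[ \int \frac{t\,dH_n(t)}{|1 + tEm_n|^2} \le M + \frac{M}{nv^{5/2}} + \frac{M}{nv^{5/2}}\Bigl(\frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z)\Bigr)^{1/2}, \tag{6.29} \]

which implies that

\[ \frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z) \le M\int \frac{dH_n(t)}{|1 + tEm_n|^2} \le M + M\int \frac{t\,dH_n(t)}{|1 + tEm_n|^2} \le M + \frac{M}{nv^{5/2}} + \frac{M}{nv^{5/2}}\Bigl(\frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z)\Bigr)^{1/2}. \]

This inequality yields

\[ \frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z) \le M. \tag{6.30} \]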
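One way to see why a self-bounding inequality of the type preceding (6.30) forces a uniform bound is the elementary step below, given as a sketch. Here $x \ge 0$ stands for $\frac{1}{n}\operatorname{tr}F^{-1}(z)F^{-1}(\bar z)$, while $a$ and $b$ are placeholder symbols (not the paper's notation) for the constant part and for the coefficient of the square root. That coefficient is of order $M/(nv^{5/2})$, which stays bounded when $v \ge Mn^{-2/5}$.

```latex
x \le a + b\,x^{1/2}, \qquad x \ge 0,\ a > 0,\ b > 0
\;\Longrightarrow\;
x^{1/2} \le \frac{b + \sqrt{b^{2} + 4a}}{2}
\;\Longrightarrow\;
x \le \frac{2b^{2} + 2\,(b^{2} + 4a)}{4} = b^{2} + 2a .
```

The first implication solves the quadratic inequality in $x^{1/2}$, and the second uses $(p + q)^2 \le 2p^2 + 2q^2$. With $a$ and $b$ bounded, $x$ is bounded by a constant.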



This, together with (6.26), ensures that

\[ |D_n| \le \frac{M}{nv^{3/2}}. \tag{6.31} \]

Next, we prove that

\[ \inf_{n,z}|Em_n(z)| \ge M > 0. \tag{6.32} \]

To this end, by (6.13) and an equality similar to (1.5), we have

\[ b_1 = -zEm_n(z) + \rho_n. \tag{6.33} \]
In view of Lemma 6 and (6.33), to prove (6.32), it is thus sufficient to show that

\[ \Bigl|\frac{1}{n}E\operatorname{tr}A_1^{-1}(z)T_n\Bigr| \le M. \tag{6.34} \]

Suppose that (6.34) is not true. There then exist subsequences $n_k$ and $z_k \to z_0 \ne 0$ such that $|\frac{1}{n}E\operatorname{tr}A_1^{-1}(z)T_n| \to \infty$ on the subsequences $n_k$ and $z_k$, which, together with (6.33) and Lemma 6, implies that $Em_n(z) \to 0$ on such subsequences. This, together with (6.30), ensures that on such subsequences

\[ \int \frac{dH_n(t)}{1 + tEm_n} \to 1, \]

which, via an equality similar to (1.5), further implies that on such subsequences,

\[ \int \frac{dH_n(t)}{1 + tEm_n} + zc_nE(m_n(z)) \to 1. \tag{6.35} \]

But, on the other hand, by (6.31) and (6.20),

\[ \int \frac{dH_n(t)}{1 + tEm_n} + zc_nE(m_n(z)) \to 0, \]

which contradicts (6.35). Therefore, (6.34) and, consequently, (6.32) hold.
It follows from (6.32) and (6.31) that for $v \ge M_8n^{-2/5}$,

\[ |\omega_n| \le \frac{M}{nv^{3/2}}, \tag{6.36} \]

where we may choose an appropriate $M_8$. Moreover, since (1.6) holds when $m$ is replaced by $m_n^0$, considering the imaginary parts of both sides of the equality, we obtain

\[ v = \frac{\Im(m_n^0)}{|m_n^0|^2} - c_n\Im(m_n^0)\int \frac{t^2\,dH_n(t)}{|1 + tm_n^0|^2}, \]



which implies that 



It follows that 



t 2 dH n (t) 
II + tm0| 2 



t 2 dH n (t) 
|l + im°| 2 



<M. 



t 2 dH n (t) 
II + tmPA 2 



1/2 



< 1 - Mv. 
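The imaginary-part identity obtained above from (1.6) can be checked numerically in a simple special case. The sketch below assumes that (1.6) has the usual form $z = -1/m + c\int t\,dH(t)/(1 + tm)$ and takes $H$ to be a point mass at $t = 1$, so that the equation reduces to a quadratic in $m$. Both the assumed form of (1.6) and the choice of $H$, as well as the function names, are illustrative assumptions and not taken from the paper.

```python
import cmath

def solve_m(z, c):
    """Root with positive imaginary part of z = -1/m + c/(1 + m),
    i.e. of z*m**2 + (z + 1 - c)*m + 1 = 0 (H = point mass at t = 1)."""
    b = z + 1.0 - c
    disc = cmath.sqrt(b * b - 4.0 * z)
    roots = [(-b + disc) / (2.0 * z), (-b - disc) / (2.0 * z)]
    return max(roots, key=lambda m: m.imag)

def imaginary_part_identity(z, c):
    """Return (v, rhs) for v = Im(m)/|m|^2 - c*Im(m)*t^2/|1 + t*m|^2, t = 1."""
    m = solve_m(z, c)
    rhs = m.imag / abs(m) ** 2 - c * m.imag / abs(1.0 + m) ** 2
    return z.imag, rhs

for z in (complex(1.0, 0.05), complex(0.7, 0.3), complex(2.0, 1.0)):
    v, rhs = imaginary_part_identity(z, c=0.5)
    print(abs(v - rhs) < 1e-9)
```

Since $\Im(m) > 0$ and $v > 0$, the identity shows that the subtracted term is strictly smaller than $\Im(m)/|m|^2$, which is the kind of strict gap exploited in the bounds that follow.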



Applying this and (6.36), as in (3.21) in [2], we may conclude that

\[ \Bigl|1 - c_nEm_nm_n^0\int \frac{t^2\,dH_n(t)}{(1 + tEm_n)(1 + tm_n^0)}\Bigr| \ge Mv. \tag{6.37} \]

This, together with (6.21) and (6.31), yields

\[ |Em_n(z) - m_n^0(z)| \le \frac{M}{nv^{5/2}}. \tag{6.38} \]

6.5. Convergence rate of $EF^{A_n}$ and $F^{A_n}$. As in Theorem 1.1 in [14], $f_{c_n,H_n}$ is continuous. Therefore,

\[ \frac{1}{v\pi}\sup_{x \in [a + \varepsilon/2,\; b - \varepsilon/2]}\int_{|y| < 2vM}|F_{c_n,H_n}(x + y) - F_{c_n,H_n}(x)|\,dy \le Mv, \]

where $\varepsilon > vM_{11}$. Lemma 2.1 in [5] or Lemma 2.1 and Corollary 2.2 in [6] are then applicable in our case.

First, consider $EF^{A_n}$. For $v \ge v_0$, by Corollary 2.2 in [6] and (6.38), we obtain, after integration in $u$ and $v$,

\[ \Delta_n \le \frac{M}{n} + M_9v + \frac{M_{10}}{nv}, \tag{6.39} \]

where we set $V$, given in Corollary 2.2 in [6], equal to one and also use the fact that $|Em_n(z') - m_n^0(z')| = O(n^{-1})$ with $z' = u + iV$ (see Section 4 in [3]). If $v = M_1n^{-2/5}$, then (6.39) gives $|\Delta_n| \le M/n^{2/5}$. If $v = \gamma\Delta_n$, then we choose $\gamma = (2M_9)^{-1}$ (here one should note that $M_{10}$ depends on $\gamma$, but $M_9$ does not depend on $\gamma$). Again, (6.39) gives

\[ |\Delta_n| \le M/n^{2/5}. \tag{6.40} \]

This completes the proof of (2.10).

Now, consider the convergence rate of $F^{A_n}$. It follows from Cauchy's inequality that

\[ n^{-1}|\operatorname{tr}A^{-2}(z) - E\operatorname{tr}A^{-2}(z)| \le \frac{M}{v}\sup_{z_1 \in C_v} n^{-1}|\operatorname{tr}A^{-1}(z_1) - E\operatorname{tr}A^{-1}(z_1)|, \]

where $C_v = \{z_1 : |z - z_1| = v_0/3\}$. This, together with Lemma 3, ensures that

M 



(6.41) En -1 |trA~ 2 (z) - J BtrA" 2 (z)| < 
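The Cauchy-inequality step above bounds a derivative of an analytic function by its supremum on a small circle. The Python sketch below illustrates that mechanism with a deterministic stand-in: the expectation $E\operatorname{tr}A^{-1}(z)$ is replaced by the resolvent trace of a second, hypothetical spectrum, and the circle radius is taken to be $v/3$. The spectra, the radius and the function names are assumptions made purely for illustration.

```python
import cmath
import math

def stieltjes_diff(eigs_a, eigs_b, n, z, power=1):
    """n^{-1} [tr (A - zI)^{-power} - tr (B - zI)^{-power}] from eigenvalues."""
    ta = sum((lam - z) ** (-power) for lam in eigs_a)
    tb = sum((lam - z) ** (-power) for lam in eigs_b)
    return (ta - tb) / n

def cauchy_bound(eigs_a, eigs_b, n, z, radius, samples=360):
    """(1/radius) * max of |f| over sampled points of |z1 - z| = radius."""
    circle = (z + radius * cmath.exp(2j * math.pi * k / samples)
              for k in range(samples))
    return max(abs(stieltjes_diff(eigs_a, eigs_b, n, z1))
               for z1 in circle) / radius

eigs_a = [0.2, 0.8, 1.1, 1.9, 3.0]  # hypothetical spectrum of A
eigs_b = [0.3, 0.7, 1.2, 2.1, 2.8]  # hypothetical comparison spectrum
n = 5
z = complex(1.0, 0.3)

# The power=2 difference is the z-derivative of the power=1 difference.
lhs = abs(stieltjes_diff(eigs_a, eigs_b, n, z, power=2))
rhs = cauchy_bound(eigs_a, eigs_b, n, z, radius=z.imag / 3)
print(lhs <= rhs)
```

Because the circle has radius $v/3$, it stays inside the upper half-plane where the resolvent traces are analytic, and the factor $1/\text{radius} = 3/v$ plays the role of the constant over $v$ in the displayed inequality.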

Equation (2.11) then follows from (6.41), Lemma 3, the argument leading 
to (6.40) and Lemma 7.1 in [6]. 